Abstract: We analyze the second moment of the ripple size
during the LT decoding process and prove that the standard
deviation of the ripple size for an LT-code with length k is of
the order of $\sqrt{k}$. Together with a result by Karp et al. stating
that the expectation of the ripple size is of the order of k [3],
this gives bounds on the error probability of the LT decoder. We
also give an analytic expression for the variance of the ripple size
up to terms of constant order, and refine the expression in [3]
for the expectation of the ripple size up to terms of the order of
1/k, thus providing a first step towards an analytic finite-length
analysis of LT decoding.

I. Introduction 

We assume the reader is familiar with Fountain codes, LT- 
codes and belief propagation (BP) decoding. For details, the 
reader is referred to [Q], Q. 

We consider LT-codes with parameters $(k, \Omega(x))$, where k is
the message length and $\Omega(x) = \sum_i \Omega_i x^i$ is the degree
distribution of the output symbols during encoding. An important set
to consider is the set of output symbols of degree 1 (the ripple).
The size of the ripple varies during the decoding process, 
as high-degree output symbols become of degree 1 after the 
removal of their edges, and as ripple elements become useless 
after the recovering of their unique neighbor. 
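
The ripple dynamics described above can be made concrete with a small simulation. The following is a minimal sketch, not the decoder analyzed in this paper: the function names `sample_degree` and `ripple_trace`, the dictionary representation of the degree distribution, and the choice of distinct neighbors for each output symbol are all assumptions made for illustration.

```python
import random

def sample_degree(omega, rng):
    """Draw a degree d with probability omega[d] (omega maps degree -> probability)."""
    r, acc = rng.random(), 0.0
    for d, p in sorted(omega.items()):
        acc += p
        if r < acc:
            return d
    return max(omega)  # guard against floating-point round-off

def ripple_trace(k, n, omega, seed=0):
    """Peel a random LT code graph and record the ripple size before each step.

    Returns (trace, recovered): trace[t] is the number of degree-1 output
    symbols just before decoding step t. Decoding stops when the ripple is
    empty (failure) or when all k input symbols are recovered.
    """
    rng = random.Random(seed)
    # Each output symbol is stored as the set of its still-unrecovered neighbors.
    outputs = [set(rng.sample(range(k), sample_degree(omega, rng))) for _ in range(n)]
    # Reverse index: input symbol -> output symbols connected to it.
    touching = [[] for _ in range(k)]
    for j, nbrs in enumerate(outputs):
        for i in nbrs:
            touching[i].append(j)
    ripple = {j for j, nbrs in enumerate(outputs) if len(nbrs) == 1}
    trace, recovered = [], 0
    while ripple and recovered < k:
        trace.append(len(ripple))
        j = ripple.pop()
        (i,) = outputs[j]              # the unique unrecovered neighbor of j
        recovered += 1
        for j2 in touching[i]:         # remove every edge of input symbol i
            nbrs = outputs[j2]
            if i in nbrs:
                nbrs.discard(i)
                if len(nbrs) == 1:
                    ripple.add(j2)     # a cloud symbol is released into the ripple
                elif not nbrs:
                    ripple.discard(j2) # its only neighbor is now known: useless
    return trace, recovered
```

Calling `ripple_trace(k, n, omega)` for several seeds and averaging the traces gives an empirical counterpart of the expected ripple size; a run has failed exactly when `recovered < k` on return.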

The decoding is in error if and only if the ripple becomes 
empty before all the input symbols are recovered. A natural 
question is thus whether we can track the size of the ripple, 
in the expectation, during the decoding process. Karp et
al. [3] proved that the expected ripple size is linear in k
throughout most of the decoding process. Their asymptotic
analytic expressions for the expected ripple size can be found
in Section II. They also derive an expression for the expected
cloud size throughout decoding, where the cloud is defined 
at each decoding step as the set of output symbols of degree 
strictly higher than 1. 

In this paper, we extend their analysis in two ways. First, we 
consider higher moments of the cloud and ripple size in order 
to upper bound the error probability of the LT decoder. More 
specifically, we use similar methods to derive an expression for 
the variance of the ripple size and prove that it is also linear in 
k throughout most of the decoding process. We can then use 
this expression together with the expression for the expectation 
to offer a guarantee for successful decoding, as follows: if, for
fixed LT-code parameters, R(u) is the expectation and $\sigma_R(u)$
is the standard deviation of the ripple size when u symbols
are unrecovered, then if the function

$$h_c(u) = R(u) - c \cdot \sigma_R(u) \tag{1}$$

for some parameter c never takes negative values, we can 
upper bound the error probability of the LT decoder by the 
probability that the ripple size deviates from its mean by more 
than c standard deviations. 
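
To illustrate how the function $h_c(u) = R(u) - c \cdot \sigma_R(u)$ would be used, the sketch below evaluates it on tabulated values and finds the largest c for which it stays non-negative. The helper names and the numbers in the usage lines are hypothetical, and the use of Chebyshev's inequality to turn c into a tail bound at a single decoding step is our assumption here, one possible choice rather than the bound derived in this paper.

```python
import math

def h(R, sigma, c):
    """h_c(u) = R(u) - c * sigma_R(u), evaluated pointwise on equal-length lists."""
    return [r - c * s for r, s in zip(R, sigma)]

def largest_safe_c(R, sigma):
    """Largest c with h_c(u) >= 0 at every tabulated u, assuming R(u) >= 0.

    Returns math.inf when the standard deviation vanishes everywhere.
    """
    ratios = [r / s for r, s in zip(R, sigma) if s > 0]
    return min(ratios) if ratios else math.inf

def chebyshev_tail(c):
    """Chebyshev bound on P(|X - E[X]| >= c * std(X)), capped at 1."""
    return min(1.0, 1.0 / (c * c))

# Made-up expectations and standard deviations at three decoding steps.
R = [20.0, 35.0, 18.0]
sigma = [4.0, 5.0, 6.0]
c = largest_safe_c(R, sigma)
```

With these made-up inputs the binding step is the third one, where the mean is only three standard deviations above zero; the design problem mentioned in the conclusion amounts to pushing this minimum ratio as high as possible.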

Second, we take the first step towards an analytic finite-length
analysis of the LT decoder, by providing exact expressions
for the expectation (variance) of the ripple size up to
O(1/k) (constant) terms. This is done by considering lower-order
terms in the difference equations, but also by getting
tight bounds on the discrepancy introduced by approximating 
difference equations by differential equations. 

It is worth noting that the expressions we deal with are
valid for "most of the decoding process," that is, the analysis 
breaks down when the number of unrecovered symbols is no 
longer a constant fraction of k. This is no issue, however, 
when one considers Raptor codes, which need only a constant 
fraction of the input symbols to be recovered by the LT 
decoder Q. 

II. Preliminaries - an expression for the expected 
ripple size 

Let u be the number of unrecovered (undecoded) input 
symbols at a given decoding step. Define the decoder to be in 
state (c, r, u) if the cloud size is c and the ripple size is r at 
this decoding step. To each state (c, r, u), we can associate the
probability $p_{c,r,u}$ of the decoder being in this state. Define the
state generating function of the LT decoder when u symbols
are undecoded as

$$P_u(x, y) = \sum_{c \ge 0,\, r \ge 1} p_{c,r,u}\, x^c y^{r-1}.$$

The following theorem by Karp et al. gives a recursion for 
the state generating function of the LT decoder. 

Theorem 1: [3] Suppose that the original code has k input
symbols and that $n = k(1 + \delta)$ output symbols have been
collected for decoding. Further, denote by $\Omega_i$, $i = 1, \ldots, D$,
the probability that an output symbol is of degree i, where D
is the maximum degree of an output symbol. Then we have
for u = k + 1, k, ..., 1

$$P_{u-1}(x, y) = \frac{1}{y} \left[ P_u\!\left( x(1 - p_u) + y p_u,\ \frac{1}{u} + y \left( 1 - \frac{1}{u} \right) \right) - P_u\!\left( x(1 - p_u),\ \frac{1}{u} \right) \right], \tag{2}$$

where, for $u \le k$, $p_u$ is the probability that a given cloud
element is released into the ripple at this decoding step (see [3]
for its exact expression), and $p_{k+1} := \Omega_1$. Further, $P_{k+1}(x, y) := x^n$.

This recursion gives a way to compute the probability of a 
decoding error at each step of the BP decoding as 

$$P_{\mathrm{err}}(u) = \sum_{c \ge 0} p_{c,0,u} = 1 - \sum_{c \ge 0,\, r \ge 1} p_{c,r,u} = 1 - P_u(1, 1)$$

and the overall error probability of the decoder as

$$P_{\mathrm{err}} = \sum_{u=1}^{k} P_{\mathrm{err}}(u).$$

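If we approximate the LT process by allowing output
symbols to choose their neighbors with replacement during
encoding, $p_u$ becomes

$$p_u = \frac{1}{k}\, f\!\left( \frac{u}{k} \right) - \frac{1}{k^2}\, g\!\left( \frac{u}{k} \right),$$

where

$$f(x) = \frac{x\, \Omega''(1 - x)}{1 - x\, \Omega'(1 - x) - \Omega(1 - x)} \quad \text{and} \quad g(x) = \frac{f(x)}{x}.$$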

With this assumption, Karp et al. use the recursion to derive 
difference equations for the expected size of the ripple and 
the cloud, and further approximate these difference equations 
by differential equations that they solve to get closed-form 
expressions for the expected ripple and cloud size. Formally, 
let R(u) denote the expected number of output symbols in 
the ripple, and C(u) denote the expected number of output 
symbols in the cloud, when u input symbols are undecoded, 
where u is assumed to be a constant fraction of the total 
number of input symbols k. Then the following theorem 
shows that R(u) is linear in k for an appropriate choice of 
the LT code parameters. 

Theorem 2: [3] Consider an LT-code with parameters
$(k, \Omega(x))$ and assume $n = (1 + \epsilon)k$ symbols have been collected
for decoding. During BP decoding, let C(u) and R(u) be
respectively the expected size of the cloud and ripple as a
function of the number u of undecoded input symbols. Then,
under the assumptions that u is a constant fraction of k and
$\Omega_1 > 0$, we have

$$C(u) = n \left( 1 - \frac{u}{k}\, \Omega'(1 - u/k) - \Omega(1 - u/k) \right) + O(1),$$

$$R(u) = (1 + \epsilon)\, u \left( \Omega'(1 - u/k) + \frac{1}{1 + \epsilon} \ln \frac{u}{k} \right) + O(1).$$



In what follows, we let $\hat C(x)$ be a continuous approximation
of $\tilde C(u/k) := C(u)/n$, a normalized version of C(u). $\hat C(x)$ can
be shown to be the solution of the differential equation

$$\hat C'(x) = f(x)\, \hat C(x)$$

with initial condition

$$\hat C(1) = \tilde C(1) = (1 - \Omega_1) \left( 1 - (1 - \Omega_1)^{n-1} \right),$$

and is given by

$$\hat C(x) = c_0 \left( 1 - x\, \Omega'(1 - x) - \Omega(1 - x) \right),$$

with $c_0 = 1 - (1 - \Omega_1)^{n-1}$.
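
That the closed form for the cloud solves the differential equation can be checked numerically. The sketch below does so for a hypothetical degree distribution $\Omega(x) = 0.1x + 0.5x^2 + 0.4x^3$ (our choice, for illustration only), setting $c_0 = 1$ since the constant cancels on both sides; the helper names are likewise assumptions.

```python
def poly_eval(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) by Horner's rule."""
    acc = 0.0
    for a in reversed(coeffs):
        acc = acc * x + a
    return acc

def derivative(coeffs):
    """Coefficient list of the derivative polynomial."""
    return [i * a for i, a in enumerate(coeffs)][1:]

# Hypothetical degree distribution Omega(x) = 0.1 x + 0.5 x^2 + 0.4 x^3.
OMEGA = [0.0, 0.1, 0.5, 0.4]
D1 = derivative(OMEGA)   # coefficients of Omega'
D2 = derivative(D1)      # coefficients of Omega''

def cloud_shape(x):
    """1 - x * Omega'(1 - x) - Omega(1 - x), the cloud closed form with c_0 = 1."""
    return 1.0 - x * poly_eval(D1, 1.0 - x) - poly_eval(OMEGA, 1.0 - x)

def f(x):
    """f(x) = x * Omega''(1 - x) / (1 - x * Omega'(1 - x) - Omega(1 - x))."""
    return x * poly_eval(D2, 1.0 - x) / cloud_shape(x)

def ode_residual(x, h=1e-5):
    """Central-difference estimate of C'(x) - f(x) * C(x) for C = cloud_shape."""
    slope = (cloud_shape(x + h) - cloud_shape(x - h)) / (2.0 * h)
    return slope - f(x) * cloud_shape(x)
```

The residual vanishes up to discretization error because the derivative of the closed form is $x\,\Omega''(1 - x)$, which is exactly the numerator of f(x) times the closed form divided by itself.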

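Similarly, we define $\hat R(x)$ as a continuous approximation of
$\tilde R(u/k) := R(u)/n$. $\hat R(x)$ is the solution of

$$\hat R'(x) = \frac{\hat R(x)}{x} - c_0\, x\, \Omega''(1 - x) + \frac{1}{1 + \epsilon}$$

with initial condition

$$\hat R(1) = \tilde R(1) = \Omega_1 - \frac{1 - (1 - \Omega_1)^n}{n}$$

and is given by

$$\hat R(x) = x \left( c_0\, \Omega'(1 - x) + \frac{1}{1 + \epsilon} \ln x + r_0 \right) \tag{3}$$

with

$$r_0 = \Omega_1 (1 - \Omega_1)^{n-1} - \frac{1 - (1 - \Omega_1)^n}{n}. \tag{4}$$

Then we can write

$$C(u) = n\, \hat C(u/k) + O(1),$$
$$R(u) = n\, \hat R(u/k) + O(1).$$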

III. An Expression for the Variance of the Ripple
Size

Let $\sigma_R^2(u)$ be the variance of the ripple size as a function
of the number of undecoded symbols u. In what follows we
will always assume that u is a constant fraction of k. $\sigma_R^2(u)$
is given by

$$\begin{aligned}
\sigma_R^2(u) &= \sum_{c \ge 0,\, r \ge 1} (r - 1)^2\, p_{c,r,u} - R(u)^2 \\
&= N(u) - R(u)^2 + R(u),
\end{aligned} \tag{5}$$

where we define

$$N(u) := \frac{\partial^2 P_u}{\partial y^2}(1, 1) = \sum_{c \ge 0,\, r \ge 1} (r - 1)^2\, p_{c,r,u} - R(u). \tag{6}$$
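
The identity behind (5) and (6) is the usual relation between the variance and the second factorial moment, and it can be checked with exact arithmetic. The sketch below is illustrative only: the marginal ripple distribution is made up, it is normalized to 1 (which ignores the probability mass of failed states), and the function name `ripple_moments` is hypothetical.

```python
from fractions import Fraction

def ripple_moments(dist):
    """dist maps a ripple size r >= 1 to its probability, summed over cloud sizes.

    Returns (R, N, var) where R = E[r - 1], N = E[(r - 1)(r - 2)] is the
    second y-derivative of sum_r p_r * y**(r - 1) at y = 1, and var is the
    variance of r - 1 computed directly from its definition.
    """
    R = sum(p * (r - 1) for r, p in dist.items())
    N = sum(p * (r - 1) * (r - 2) for r, p in dist.items())
    var = sum(p * (r - 1) ** 2 for r, p in dist.items()) - R ** 2
    return R, N, var

# Made-up marginal ripple distribution, chosen only for illustration.
dist = {1: Fraction(1, 4), 2: Fraction(1, 2), 4: Fraction(1, 4)}
R, N, var = ripple_moments(dist)
```

Here the directly computed variance coincides with `N - R**2 + R`, which is why an expression for N(u) together with the known expression for R(u) is enough to obtain the variance.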



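It is thus enough to find an expression for N(u) to get an
expression for $\sigma_R^2(u)$. We start by differentiating both sides
of the recursion (2) twice with respect to y and evaluating at
(1, 1). This gives us a recursion for N(u):

$$\begin{aligned}
N(u - 1) ={}& \left( 1 - \frac{1}{u} \right)^2 N(u) - 2 p_u C(u) - 2 \left( 1 - \frac{1}{u} \right) R(u) \\
&+ p_u^2\, \frac{\partial^2 P_u}{\partial x^2}(1, 1) + 2 p_u \left( 1 - \frac{1}{u} \right) \frac{\partial^2 P_u}{\partial x\, \partial y}(1, 1) \\
&- 2 \left( -P_u(1, 1) + P_u\!\left( 1 - p_u,\ \frac{1}{u} \right) \right).
\end{aligned} \tag{7}$$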



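Before we can proceed with solving this difference equation,
we need to find expressions for the second-order derivatives
$\frac{\partial^2 P_u}{\partial x^2}(1, 1)$ and $\frac{\partial^2 P_u}{\partial x\, \partial y}(1, 1)$. We do so by following exactly the
same method that we are currently outlining for an expression
for N(u). Define

$$M(u) := \frac{\partial^2 P_u}{\partial x^2}(1, 1), \qquad L(u) := \frac{\partial^2 P_u}{\partial x\, \partial y}(1, 1).$$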

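Let $\hat M(x)$ be a continuous approximation of the normalized
function $\tilde M(u/k) := M(u)/n^2$. It can be shown that $\hat M(x)$ is
the solution of the differential equation

$$\hat M'(x) = 2 f(x)\, \hat M(x)$$

with initial condition

$$\hat M(1) = \left( 1 - \frac{1}{n} \right) (1 - \Omega_1)^2 \left( 1 - (1 - \Omega_1)^{n-2} \right)$$

and is given by the expression

$$\hat M(x) = m_0 \left( 1 - x\, \Omega'(1 - x) - \Omega(1 - x) \right)^2$$

with

$$m_0 = \left( 1 - \frac{1}{n} \right) \left( 1 - (1 - \Omega_1)^{n-2} \right).$$

Similarly, let $\hat L(x)$ be a continuous approximation of $\tilde L(u/k) := L(u)/n^2$. It is the solution of

$$\hat L'(x) = \left( \frac{1}{x} + f(x) \right) \hat L(x) - f(x)\, \hat M(x) + \frac{1}{1 + \epsilon}\, \hat C(x)$$

with initial condition

$$\hat L(1) = \left( 1 - \frac{1}{n} \right) \Omega_1 (1 - \Omega_1),$$

and an expression for it is

$$\hat L(x) = x \left( 1 - x\, \Omega'(1 - x) - \Omega(1 - x) \right) \left( m_0\, \Omega'(1 - x) + \frac{c_0}{1 + \epsilon} \ln x + l_0 \right)$$

with

$$l_0 = \left( 1 - \frac{1}{n} \right) \Omega_1 (1 - \Omega_1)^{n-2}.$$

Then the following theorem gives closed-form expressions
for M(u) and L(u).

Theorem 3:

$$M(u) = n^2 \hat M(u/k) + O(k),$$
$$L(u) = n^2 \hat L(u/k) + O(k).$$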

As for the "dirt" term

$$-2 \left( -P_u(1, 1) + P_u\!\left( 1 - p_u,\ \frac{1}{u} \right) \right), \tag{8}$$

it does not involve derivatives and we cannot use the same
method to find an expression for it independent of the state
generating function. However, we can bound it under an
assumption on the ripple size. More specifically, it is not
difficult to prove that for $r \ge 3$, the dirt term is of constant
order. In what follows, we assume that the size of the ripple
does not go below the constant 3.

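Replacing M(u) and L(u) by their expressions and bounding
the dirt term in the recursion (7), we obtain the following
difference equation for N(u):

$$\begin{aligned}
N(u) - N(u - 1) ={}& \left( \frac{2}{u} - \frac{1}{u^2} \right) N(u) - p_u^2\, M(u) - 2 p_u \left( 1 - \frac{1}{u} \right) L(u) \\
&+ 2 p_u C(u) + 2 \left( 1 - \frac{1}{u} \right) R(u) + O(1).
\end{aligned} \tag{9}$$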



Note that N(u) as defined in equation (6) can be as large
as a constant fraction of $k^2$. We thus need to normalize N(u)
if we want to say something meaningful about the difference
N(u) - N(u - 1). We define x := u/k to be the fraction of
undecoded symbols, and let $\tilde N(x) := N(u)/n^2$ be a normalized
version of N(u). We similarly normalize the other functions
of u and represent them as functions of x:

$$\tilde M(x) := M(u)/n^2, \qquad \tilde L(x) := L(u)/n^2,$$
$$\tilde C(x) := C(u)/n, \qquad \tilde R(x) := R(u)/n.$$

Normalizing equation (9) and replacing the functions
$\tilde M(x)$, $\tilde L(x)$, $\tilde C(x)$ and $\tilde R(x)$ by their continuous approximations
$\hat M(x)$, $\hat L(x)$, $\hat C(x)$ and $\hat R(x)$, we obtain

$$\tilde N(x) - \tilde N(x - 1/k) = \frac{2}{kx}\, \tilde N(x) - \frac{2}{k}\, f(x)\, \hat L(x) + \frac{2}{(1 + \epsilon) k}\, \hat R(x) + O(1/k^2).$$

Neglecting lower-order terms, we approximate the normalized
function $\tilde N(x) = N(u)/n^2$ by the function $\bar N(x)$ which satisfies

$$\bar N(x) - \bar N(x - 1/k) = \frac{2}{kx}\, \bar N(x) - \frac{2}{k}\, f(x)\, \hat L(x) + \frac{2}{(1 + \epsilon) k}\, \hat R(x),$$

with initial condition $\bar N(1) = \tilde N(1)$.

Claim 1: For any x on which $\tilde N(x)$ is defined, $\tilde N(x)$ and
$\bar N(x)$ differ by a term of the order of 1/k.

We skip the proof of this and subsequent claims for reasons
of space, and refer the reader to the final version of this paper.

We further approximate the discrete function $\bar N(x)$ by the
continuous function $\hat N(x)$, and

$$\frac{\bar N(x) - \bar N(x - 1/k)}{1/k}$$

by the first-order derivative of $\hat N(x)$. $\hat N(x)$ satisfies the
differential equation

$$\hat N'(x) = \frac{2}{x}\, \hat N(x) - 2 f(x)\, \hat L(x) + \frac{2}{1 + \epsilon}\, \hat R(x) \tag{10}$$

with initial condition $\hat N(1) = \tilde N(1)$.

Claim 2: For any x on which $\bar N(x)$ is defined, $\bar N(x)$ and
$\hat N(x)$ differ by a term of the order of 1/k.
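
The kind of discrepancy bounded in these claims can be observed numerically on a simpler recursion. The sketch below does not treat N(u); it uses the cloud instead, under two assumptions of ours: that the normalized cloud obeys the first-order recursion C(x - 1/k) = (1 - f(x)/k) C(x), and that the hypothetical degree distribution is $\Omega(x) = 0.1x + 0.5x^2 + 0.4x^3$. It iterates the difference equation from x = 1 and compares it with the closed-form solution of the differential equation.

```python
# Hypothetical degree distribution Omega(x) = 0.1 x + 0.5 x^2 + 0.4 x^3.
OMEGA = [0.0, 0.1, 0.5, 0.4]

def poly(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) by Horner's rule."""
    acc = 0.0
    for a in reversed(coeffs):
        acc = acc * x + a
    return acc

def dpoly(coeffs):
    """Coefficient list of the derivative polynomial."""
    return [i * a for i, a in enumerate(coeffs)][1:]

D1 = dpoly(OMEGA)
D2 = dpoly(D1)

def shape(x):
    """Closed-form ODE solution with c_0 = 1: 1 - x Omega'(1 - x) - Omega(1 - x)."""
    return 1.0 - x * poly(D1, 1.0 - x) - poly(OMEGA, 1.0 - x)

def max_discrepancy(k, x_min=0.5):
    """Largest gap between the iterated recursion and the ODE solution on [x_min, 1]."""
    c = shape(1.0)                 # both start from the same initial condition
    worst, u = 0.0, k
    while u - 1 >= x_min * k:
        x = u / k
        f = x * poly(D2, 1.0 - x) / shape(x)
        c *= 1.0 - f / k           # one step of the difference equation
        u -= 1
        worst = max(worst, abs(c - shape(u / k)))
    return worst
```

Comparing `max_discrepancy(200)` with `max_discrepancy(2000)` shows the gap shrinking as k grows, in line with a discrepancy of the order of 1/k; the range is restricted to x >= 0.5 because the analysis assumes that u remains a constant fraction of k.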



The general solution of the differential equation (10) is
given by

$$\hat N(x) = x^2 \left( m_0\, \Omega'(1 - x)^2 + 2 l_0\, \Omega'(1 - x) + \frac{2 c_0}{1 + \epsilon}\, \Omega'(1 - x) \ln x + \frac{2 r_0}{1 + \epsilon} \ln x + \frac{1}{(1 + \epsilon)^2} (\ln x)^2 + n_0 \right), \tag{11}$$

where the value of the constant $n_0$ can be found to be, by the
initial conditions,

$$n_0 = \left( 1 - \frac{1}{n} - m_0 \right) \Omega_1^2 - 2 l_0\, \Omega_1 - \frac{2}{n}\, \Omega_1 + \frac{2}{n^2} \left( 1 - (1 - \Omega_1)^n \right).$$

By claims 1 and 2 we thus have

$$\tilde N(x) = \hat N(x) + O(1/k),$$

where $\hat N(x)$ is given by equation (11). This gives us an
expression for N(u), up to a term of the order of k:

$$N(u) = (1 + \epsilon)^2 u^2 \left( \Omega'(1 - u/k)^2 + \frac{2}{1 + \epsilon}\, \Omega'(1 - u/k) \ln \frac{u}{k} + \frac{1}{(1 + \epsilon)^2} \left( \ln \frac{u}{k} \right)^2 \right) + O(k).$$

Comparing this expression to that for $R(u)^2$ given by equations
(3) and (4), it is easy to see that these two expressions agree
up to terms of the order of k, so that the variance of the ripple
size

$$\sigma_R^2(u) = N(u) - R(u)^2 + R(u)$$

is of the order of k.

Theorem 4: Consider an LT-code with parameters $(k, \Omega(x))$
and let $\sigma_R(u)$ be the standard deviation of the ripple size
throughout BP decoding. Then

$$\sigma_R(u) = O(\sqrt{k}).$$

IV. Toward a Finite-Length Analysis of the LT 
Decoder 

Our ultimate goal is to be able to bound the error probability 
of the decoder as a function of k, without the assumption that 
k goes to infinity. We thus need to find an expression for the 
variance of the ripple size, instead of simply determining its 
order. For this purpose, we must find an expression for N(u)
up to terms of constant order, and an expression for R(u) up to
terms of the order of 1/k. We illustrate the analysis for N(u).
From the recursion given by equation (7), we proceed by first,
assuming that the ripple size does not go below 4 so that the
"dirt" term is of the order of 1/k; and second, replacing C(u),
R(u), M(u), and L(u) by finer approximations as follows:

$$\begin{aligned}
C(u) &= n\, \hat C(u/k) - n\, D_C(u/k) \\
R(u) &= n\, \hat R(u/k) - n\, D_R(u/k) \\
M(u) &= n^2 \hat M(u/k) - n^2 D_M(u/k) \\
L(u) &= n^2 \hat L(u/k) - n^2 D_L(u/k),
\end{aligned} \tag{12}$$

where $D_C(x)$ is a discrepancy term introduced by approximating
C(u) by $n\, \hat C(u/k)$, and $D_R(x)$, $D_M(x)$ and $D_L(x)$ are defined
similarly. These discrepancy terms are all of the order of 1/k
and are given by the following expressions.



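$$D_C(x) = \frac{1}{k^2} \sum_{i=0}^{k(1-x)-1} C_i \prod_{j=i+1}^{k(1-x)-1} \left( 1 - \frac{c_j}{k} \right) + O(1/k^2)$$

$$D_R(x) = \frac{1}{k^2} \sum_{i=0}^{k(1-x)-1} R_i \prod_{j=i+1}^{k(1-x)-1} \left( 1 - \frac{r_j}{k} \right) + O(1/k^2)$$

$$D_M(x) = \frac{1}{k^2} \sum_{i=0}^{k(1-x)-1} M_i \prod_{j=i+1}^{k(1-x)-1} \left( 1 - \frac{m_j}{k} \right) + O(1/k^2)$$

$$D_L(x) = \frac{1}{k^2} \sum_{i=0}^{k(1-x)-1} L_i \prod_{j=i+1}^{k(1-x)-1} \left( 1 - \frac{l_j}{k} \right) + O(1/k^2)$$

where $C_i$, $R_i$, $M_i$, $L_i$ and $c_j$, $r_j$, $m_j$, $l_j$ are constants for most of
the decoding process and are given by

$$C_i = \hat C''(1 - i/k) - g(1 - i/k)\, \hat C(1 - i/k)$$

$$c_j = f(1 - j/k)$$

$$R_i = \hat R''(1 - i/k) + g(1 - i/k)\, \hat C(1 - i/k) + k f(1 - i/k)\, D_C(1 - i/k)$$

$$r_j = \frac{1}{1 - j/k}$$

$$M_i = \hat M''(1 - i/k) - \left( 2 g(1 - i/k) + f(1 - i/k)^2 \right) \hat M(1 - i/k)$$

$$m_j = 2 f(1 - j/k)$$

$$\begin{aligned}
L_i ={}& \hat L''(1 - i/k) - 2 g(1 - i/k)\, \hat L(1 - i/k) + \left( g(1 - i/k) + f(1 - i/k)^2 \right) \hat M(1 - i/k) \\
&+ k f(1 - i/k)\, D_M(1 - i/k) - \frac{1}{1 + \epsilon}\, f(1 - i/k)\, \hat C(1 - i/k) - \frac{k}{1 + \epsilon}\, D_C(1 - i/k)
\end{aligned}$$

$$l_j = \frac{1}{1 - j/k} + f(1 - j/k).$$

These expressions are obtained by the same method that we
are now following to obtain a more precise approximation of
N(u).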

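The next step is to write a recursion for the normalized function
$\tilde N(x) = N(u)/n^2$ which is exact up to terms of the order of $1/k^3$.
We then approximate $\tilde N(x)$ by $\bar N(x)$ which satisfies the same
recursion except that we neglect terms of the order of $1/k^3$:

$$\begin{aligned}
\bar N(x) - \bar N(x - 1/k) ={}& \left( \frac{2}{kx} - \frac{1}{k^2 x^2} \right) \bar N(x) - \frac{1}{k^2}\, f(x)^2\, \hat M(x) \\
&- \left( \frac{2}{k}\, f(x) - \frac{4}{k^2}\, g(x) \right) \hat L(x) + \frac{2}{k}\, f(x)\, D_L(x) \\
&+ \frac{2}{(1 + \epsilon) k^2}\, f(x)\, \hat C(x) + \left( \frac{2}{(1 + \epsilon) k} - \frac{2}{(1 + \epsilon) k^2 x} \right) \hat R(x) \\
&- \frac{2}{(1 + \epsilon) k}\, D_R(x) - \frac{2}{(1 + \epsilon)^2 k^2}.
\end{aligned}$$

Claim 3: For any x on which $\tilde N(x)$ is defined, $\tilde N(x)$ and
$\bar N(x)$ differ by a term of the order of $1/k^2$.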

We further approximate $\bar N(x)$ by $\hat N(x)$ which satisfies the
differential equation (10) and is given by expression (11). A
more careful analysis of the discrepancy between $\bar N(x)$ and
$\hat N(x)$ leads to the following claim:

Claim 4: For any x on which $\bar N(x)$ is defined, $\bar N(x)$ and
$\hat N(x)$ differ by a term of the order of 1/k.
More precisely,

$$\hat N(x) - \bar N(x) = D_N(x),$$

where

$$\begin{aligned}
D_N(x) = \frac{1}{k^2} \sum_{i=0}^{k(1-x)-1} \Bigg[ & \hat N''(1 - i/k) - \frac{\hat N(1 - i/k)}{(1 - i/k)^2} - f(1 - i/k)^2\, \hat M(1 - i/k) + 4 g(1 - i/k)\, \hat L(1 - i/k) \\
&+ 2 k f(1 - i/k)\, D_L(1 - i/k) + \frac{2 f(1 - i/k)}{1 + \epsilon}\, \hat C(1 - i/k) - \frac{2 \hat R(1 - i/k)}{(1 + \epsilon)(1 - i/k)} \\
&- \frac{2k}{1 + \epsilon}\, D_R(1 - i/k) - \frac{2}{(1 + \epsilon)^2} \Bigg] \prod_{j=i+1}^{k(1-x)-1} \left( 1 - \frac{2}{k(1 - j/k)} \right) + O(1/k^2).
\end{aligned}$$

By claims 3 and 4 we thus have

$$\tilde N(x) = \hat N(x) - D_N(x) + O(1/k^2),$$

where $\hat N(x)$ is given by equation (11). Using the resulting
expression for N(u), and the expression for R(u) given by
equation (12), we finally get an expression for the variance
of the ripple size up to terms of constant order.

Theorem 5: Consider an LT-code with parameters $(k, \Omega(x))$
and overhead $\epsilon$ and let $\sigma_R^2(u)$ be the variance of the ripple size
throughout BP decoding. Then



i u 
ln k 



-(i+ e )^(V(i-vfc) 2 + I ^ 

+ (1 + e)u - u/k) + ln 

+ nD R (u/k)^ 



2fii 



1 + 2 



- nD N (u/k) + 0(l). 

Figure 1 shows a plot of the expected ripple size and the
functions $h_1(u)$ and $h_2(u)$ given by equation (1), throughout
the decoding process, for an LT-code with k = 800 and $\epsilon = 0.1$,
and with the "Capped Soliton" degree distribution

$$\Omega(x) = \frac{1}{50}\, x + \sum_{i=2}^{50} \frac{x^i}{i(i-1)},$$

inspired by Luby's Ideal Soliton distribution [1]. The plot
also shows the result of real simulations of this code, and
confirms that the problem zones of the decoder are those
predicted by the functions $h_c(u)$: the closer they are to the
x-axis, the more probable it is that the decoder fails. As can
be seen, there is a fair chance that the decoder fails when the
fraction of decoded input symbols is between and 0.2, and
there is a very good chance that the decoder fails when the
fraction of decoded input symbols is close to 0.95.

Fig. 1. Ripple size expectation and standard deviation versus the fraction of
decoded input symbols. The black line is the empirical failure probability of
the decoder based on 100 million simulations. It confirms that the "problem
zones" of the decoder are the ones predicted by the second moment method.
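
For readers who wish to reproduce the expected-ripple curve, the sketch below builds the Capped Soliton distribution with maximum degree 50 and evaluates the leading term of R(u) from Theorem 2. It is an illustration under stated assumptions: the function names are ours, the O(1) remainder of Theorem 2 is simply dropped, and no attempt is made to compute the standard deviation or the functions $h_c(u)$.

```python
import math

def capped_soliton(D=50):
    """Omega_1 = 1/D and Omega_i = 1/(i(i-1)) for i = 2..D, as a dict degree -> probability."""
    omega = {1: 1.0 / D}
    for i in range(2, D + 1):
        omega[i] = 1.0 / (i * (i - 1))
    return omega

def omega_prime(omega, x):
    """Omega'(x) = sum_i i * Omega_i * x**(i - 1)."""
    return sum(i * p * x ** (i - 1) for i, p in omega.items())

def expected_ripple(u, k, eps, omega):
    """Leading term of R(u) in Theorem 2; the O(1) remainder is not included."""
    x = u / k
    return (1.0 + eps) * u * (omega_prime(omega, 1.0 - x) + math.log(x) / (1.0 + eps))

# Parameters of the example above: k = 800, overhead 0.1, maximum degree 50.
omega = capped_soliton(50)
curve = [expected_ripple(u, 800, 0.1, omega) for u in range(800, 0, -1)]
```

The list `curve` is indexed by the number of decoded symbols; at the start of decoding (u = k) the leading term reduces to $n\,\Omega_1$, the expected number of degree-1 output symbols among the n collected ones.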

V. Conclusion 

We have given an analytic expression for the variance of 
the ripple size throughout the LT decoding process. This 
expression is asymptotically of the order of k, and we have 
expressed it as a function of k as a first step toward finite- 
length analysis of the LT decoding. The next step is to work 
around the assumption that u is a "constant fraction" of k.
Then we would obtain a guarantee for successful decoding 
as a function of the LT-code parameters and overhead for 
practical values of k. This would then allow us to solve 
the corresponding design problem, namely to choose degree 
distributions that would make the function h c (u) stay positive 
for as large a value of c as possible, for a fixed code length k. 